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ABSTRACT 


Optimization of power consumption in a real time embedded systems is a critical design parameter. As most of low power 
embedded systems are battery operated and have very limited source of power. Therefore the optimization of power 
utilization is discussed in this paper establishing its relationship with execution frequency of core processing unit in 
multitasking environment. The number of processes required to be executed in multitasking embedded systems may vary 
with time for an application, and hence the dynamic power consumed also vary accordingly. Each process in real time are 
defined with respective absolute deadlines and their relative deadlines depend on the number of processes schedules at a 
time. The relative deadline parameter can be used to estimate the required operating frequency of processing unit and can 
be scaled accordingly to optimize the power consumption. The relative deadlines needs to be less than the absolute 
deadline for any process to complete the task successfully. Therefore the operating frequency can be dynamically scaled 


such that the relative deadlines are just within absolute deadlines. 


The hard-real time systems are designed to meet the deadline of all processes under maximum process load, hence the 
operating frequency of the processing unit is chosen much higher to complete make sure all the processes within deadline 
under peak load condition. However, scaling down the frequency under other conditions will improve the thermal stability 


and also optimize the power consumption. 


The analysis of frequency requirement is done for two most widely used scheduling methods, Earliest Deadline First 


(EDF) and Rate Monotonic (RM) scheduling algorithms for comparison. 
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INTRODUCTION 


The performance of multitasking real time systems are measured not only based on completion of the jobs assigned 
but also on the time consumed to complete the jobs successfully. The system is said to be performing better if it 
can complete all the assigned jobs or tasks within a defined deadline with minimum power consumption. The 
aggressive power management techniques are very important in embedded systems specially those which are 


battery operated. 


The design related issues are affected by the time constraints of a system. The real time performance of 
system depends on successful execution of processes within relative deadlines which are decided dynamically 
based on the process load. The hard real time systems are designed with sufficiently high frequency processor to 


meet the deadlines in worst case scenario when all processes are active and scheduled in the system. However, 
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most of the time systems will not be executing all processes, in such scenarios deadlines can be met even by operating at 
much lower frequencies and the by saving the power under such circumstances. There by frequency scaling can be 
implemented to optimize the power consumption. The scaling factor is calculated based on the relative deadlines of 


scheduled processes. 


The general equations related to calculation of various parameters such as absolute deadline, relative deadline, 
frequency scaling factor are discussed and then the comparative analysis is done using EDF and RM scheduling 
algorithms. The few dynamic parameters are logically assumed considering their relationship and impact on overall task 


execution to simulate the real time conditions. 


RELATED WORK 

Frequency Scaling Factor 

A real time embedded system with N number of processes may execute only selected processes under different conditions. 
Considering X is a count of maximum processes that are scheduled in system under peak load condition. The processing 
unit operates at highest possible frequency to achieve Trp (Relative Deadline) less than Tap (Absolute Deadline) for all the 
processes. However if system is working with less number of processes then frequency can be scaled to lower level which 
in turn reduces the power consumption. If in any case Trp of process exceeds its absolute deadline Tap, then frequency of 
operation must be scaled up and if Trp is much lesser than Tap then operating frequency needs to be scaled down. The 


frequency scaling is proportional to difference between Trp and Tap of last process in the process queue. 
fa (Trp = Tap) 1 


If (Trp — Tap) is positive, then frequency is scaled up and if it’s negative then frequency is scaled down. The 
scaling factor is calculated as the ratio of difference between the relative and absolute deadline to relative deadline 


achieved. In multitasking, the process having highest scaling factor is considered. 
k = (Tp- Trp) / Tro 2 
Hence, the new required frequency of operation fr required to meet the deadline is given by equation 3. 
fo=k*f 3 
The frequency scaling algorithm involves following steps. 
e Identify the processes completing the execution within scheduling period but missing the deadline. 


e If all scheduled processes are meeting the deadline and some processes are left out without getting even 


single slot then frequency is arbitrarily scaled up by 25 % and checked for updated status. 
e = The frequency scaling factor is calculated for all processes completing the execution but missing the deadline. 
e Select the highest factor and scale the frequency accordingly. 
e Check if all the processes are meeting the deadline using revised frequency. 


e =6If ‘YES’ then same frequency is maintained for application, otherwise the cycle repeats from step 1 till all 


processes meet the deadline requirements. 


e The lowest operating frequency is computed such that at least one process violates the deadline below that 
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frequency [1]. 


The system requires and additional programs to monitor the change in turnaround time due to dynamic variation 
in number of processes and to calculate the scaling proportion of operating frequency. These processes are to be active 
throughout the operating period of the system. The additional overhead introduced due to these background processes to 


monitor this operation also should be considered during design cycle. 
Frequency Multiplier Techniques 


The frequency multiplier techniques can be used to scale the operating frequency of processing unit if it can be integrated 


as a module within the processor. There are various techniques invented overt the period are listed below. 


e There are many variable frequency multipliers based on energy distributed on harmonics and input multiplier 


[2]. These multiplier circuits can be used to scale the frequency as and when required. 


e Phase Locked Loop can be configured as frequency multiplier which used multiphase voltage controlled 


oscillator [3]. 


e Another technique to scale the frequency is sinusoidal frequency doublers characterized by tuned LC circuits 


[4]. 


e The frequency and voltage of processor can be dynamically changed based on processor utilization factor to 


reduce the overall power consumption [5]. 


e A neural network model technique also can be used to predict the future load on the system and decide the 


scaling factor in advance [6] [7]. 


e Used Driven Frequency Scaling is another efficient and usage based customized frequency scaling technique 


[8]. 


e Learning direct frequency scaling model used Counter Propagation Network (CPN) to predict the behavior of 


task and compute corresponding frequency required [9]. 
Power Performance Impact in Multitasking Systems 


The performance of multitasking real time systems are measured not only based on completion of the jobs assigned but 
also on the time consumed to complete the jobs. The system is said to be performing better if it can complete all the 
assigned jobs or tasks within a defined deadline with minimum power consumption. The aggressive power management 
techniques are very important in embedded systems specially those which are battery operated. Programmable clock gating 
and DVFS are extensively used for power management in SoC technology [10]. Therefore the operating frequency of the 
processor should be scalable to achieve best performance. In case of multitasking systems deadline partitioning with fair 
DVFS (DPF-DVFS) can be used to schedule processes optimally [11]. The real time scheduling in multiprocessor systems 
can also be achieved by deadline partitioning [12] [13] [14] [15]. The optimal energy efficient techniques for real time 


systems having periodic tasks are achieved but there are no optimal algorithm for systems having sporadic tasks. 


Process scheduling in modern technologies like cloud server poses a greater challenge as the heterogeneity is 
involved in both resources used as well as processes to be serviced [16]. Heterogeneous resources are managed by Virtual 


Machine Monitor (VMM) for efficient utilization and optimal energy consumption [17]. The heterogeneous design of 
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processor with multi-core technology leads to load distribution concept to optimize the average energy consumed by each 
core [18]. Partitioning the total task set into subset and permanently binding them to a particular core with separate run 
queue is one method can be implemented in systems having fixed or deterministic tasks set [19][20]. The optimal 
distribution of load can be formulated by heuristic method load distribution and assignment method using combination of 


DVFS and DFS with low latency sleep states [21]. 


The voltage required for the stable operation of processing unit depends on operating frequency. Hence operating 
voltage can be scaled proportional to operating frequency with consideration of minimum voltage requirements. There are 
two basic components viz. static and dynamic power which decides the overall power consumption of processing unit. The 
static power consumption is basically due to leakage current in various circuits in idle state of processor and dynamic 
power is based on clocking frequency while processor is actively execution a process. Due to static power consumption 
and asymptotic execution time the software code execution shows convex energy behavior, i.e., there exists an optimal 
CPU frequency at which energy consumption is minimal. Leakage current is playing more important role as transistor sizes 
have become smaller and threshold voltage levels lower. A decade ago, dynamic power accounted for approximately two- 
thirds of the total chip power. The processes associated with tasks consists for set of instructions to achieve the required 


operation. These instructions can be broadly classified as memory bound, I/O bound and CPU bound. 


Memory bound instructions are used to access (read / write) to and from primary memory of the system. As 
operating speed of memories are much slower compared to processors, the scheduler may force the particular process to 
suspended state and next process starts executing. However if the process being executed is of higher priority compared to 
other processes then it will not release the resources allocated. This may lead to reduction in the overall performance of the 
device. In such cases the process hold the processor till it retrieve the required data from the memory wasting precious time 
of processing unit. Therefore it’s desired to reduce memory bound instructions as much as possible in case of high 


performance systems. To achieve this processors support large set of registers and cache memory architecture. 


I/O bound instructions deals with exchange of data with peripheral inputs or output ports. The response time of 
these instructions depends on the response ability of the device it is trying to communicate. In some cases if the data is 
expected over network or user interfaces then this operation will take considerably large time. However for better 
interactive systems it is necessary to have I/O bound operations to provide proper user interaction and operational 


feedback. 


CPU bound instructions are completely depends on processor for completion of the task and hence are executed 
much faster than other two types of instructions. All the data processing and manipulations are done by such instructions 
and they form large part of program in processing oriented applications. Whereas these instructions are limited in user or 


network interfacing programs where in other two type of instructions dominate. 


The power efficiency of system can be improved by reducing the dynamic power consumption by operating at 
optimal frequency to meet the required deadlines. This also improves thermal efficiency as voltage requirement reduces 
with respect to frequency. DFS reduces the number of instructions executed by the CPU in unit time and hence normally 


used when programs contain mostly an IO bound instructions. 


The same phenomenon of DFS [21] can be used to optimize power consumption in case of multitasking systems 


by scaling the frequency to just meet the required deadlines. As the number of processes in the execution queue increases 
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then operating frequency is also scaled accordingly to meet the deadlines. If the current frequency itself can complete the 
executions of all processes within deadlines, then frequency can be maintained same and can be scaled down to save more 
power. As completion of process execution much before its deadline does not provide any considerable incentives in terms 


of system performance. 


The “Advanced Configuration and Power Interface (ACPI)”, provides an open standard operating system to 
discover and configure hardware components of computer [22]. In most of modern CPU “P” — states are defined which 
allow clock rate reduction and “T” states (throttling states) which allow to throttle down actual clock rate. Using ACPI 
power management comes under OS instead of BIOS which relied on platform specific firmware to determine power 


management and configuration policies. 


The MATLAB programs are designed to calculate relative deadline with respect to process ID and plot the 
variation. As general computer systems or laptops does not allow user to change the operating frequency of the processor 
directly using APIs due to safety reasons. However, the requirement of scaling can be analyzed based on variation of 
relative deadlines. If all processes are completed well within the required deadline then frequency can be scaled down and 


if any process is missing the deadline then frequency of operation need to be scaled up proportionately. 
Simulation of Frequency Scaling Algorithm 


The frequency scaling is simulated in MATLAB considering five set of processes with both EDF and RM scheduling 
principles. Each process is characterized by its execution time (Abs_Te), period of the process (Period_P) and their 
respective absolute deadlines (Abs_Deadline). Then based on algorithms explained in previous sections the execution 
queue is created for both scheduling policies. Three scaling factors are considered for frequency to illustrate the variation 


in relative deadlines. Algorithm of program to illustrate frequency scaling impact on multitasking system is given below. 
e Initialize execution time, absolute deadline and period of process. 
e Priority list are created for EDF and RM scheduling policies and used as execution queue sequence. 
e Frequency scaling factors are defined as 1.0, 1.5 & 2.0. 
e Relative deadlines are calculated for different scaling factors. 


e The figure is plotted to indicate variation of relative deadlines of process for all scaling factor and compared 


to identify the factor satisfying all deadlines. 


The deadlines defined for the processes are 6, 4, 10, 16, and 12 for five processes respectively. Each process is 
given twice as much time as its execution time as deadline. In soft and firm real time systems the deadlines are slightly 
relaxed to reduce the cost and power consumption. The relative deadline equation defined in chapter two is used for the 
calculations and plotted on the figure for both EDF and RM scheduler with three different frequency scaling factors as 


represented by the legends in each figure. 
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Figure 1: Impact of Frequency Scaling in EDF 


In EDF scheduling, the order of process queue is 2, 1, 3, 5, and 4 assuming all processes are ready for execution. 


Following observations can be depicted from the variation of relative deadline plotted. 


e Scaling factor 1.0: Process 2 meets the deadline where all other processes 1, 3, 4, 5 are missing the required 


deadline. 
e =©Scaling factor 1.5: Process 1, 2, 3, 4 meet the deadline whereas process 5 misses required deadline. 
e = Scaling factor 2.0: Process 1, 2, 3, 4 meet the deadline whereas process 5 misses required deadline. 


Therefore, it can be concluded that if the processes are to be scheduled using EDF policy then system required 
and CPU with higher operating frequency. The higher frequency also indicates higher power consumption. In real time 
systems specially operating on battery backup it’s very important to optimize the consumption of power. This analysis will 


be of great help in such cases to select the appropriate scheduling policies. 


Process IDs Vs Relative Deadline-RM 
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Figure 2: Impact of Frequency Scaling in RM 
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The order of scheduling in RM policy is 1, 3, 2, 4, 5 with all processing are in ready queue of scheduler. The 


following observations can be concluded by the figure 2. 


e Scaling factor 1.0: Process 1 meets the deadline where all other processes 1, 3, 4, 5 are missing the required 


deadline. 
e Scaling factor 1.5: Process 1, 3, 4 meet the deadline whereas process 2, 5 misses required deadline. 
e Scaling factor 2.0: Process 1, 3, 4 meet the deadline whereas process 2, 5 misses required deadline. 


The comparison of the two scheduling policies for the set of processes considered reveals that even if the 
operating frequency is scaled by the factor of 2.0, two processes miss deadline in case of RM scheduling whereas only | 
process misses the deadline in case of EDF policy. If the hardware limitations are imposed and higher frequency CPU is 


not allowed, then EDF scheduling is efficient for the system. 


Estimation of execution time for processes can be done in two different methods. Which helps in designing the 
system with appropriate CPU and hardware resources. As already discussed, if the system is hard real time then system is 
designed assuming maximum number of processes in ready state. In case of firm and soft real time systems the design is 
done with consideration of average number of processes or total periodic processes. There are different methods to 
estimate the execution time and resource requirements. Two simple and effective methods applicable in case of 


commercial real time systems are discussed below. 


In first method, all the programs are converted to equivalent assembly language program and then clock cycles 
required for each instruction is verified. The total number of clock cycles required for program is calculated and multiplied 
with clock period to get the execution time of program. Once the execution time is calculated for all program’s execution 
time is compared with required deadlines. If any program fails to meet the required deadline then CPU can be upgrade to 
higher operating frequency to meet the requirement. However, this method becomes complicated for complex programs 


and if many sub functions are included is the system operation. 
T=)... CPI(i) * Teyete 4 
T = Time required for execution of program 
n = Number of instructions 
CPI = Machine cycle per instruction for i instruction 
Tcycle = Time period of one machine cycle 
Machine cycle period varies with respect to frequency ‘f’ and clock cycles per machine cycles. 


The second method is based on estimation on standard processing unit and then comparing the performance with 
target hardware designed for the system. In this method each program is executed on standard CPU and execution times 
are tabulated and then based on these values the actual execution on target hardware is calculated. This method is very 
simple and used only when the target hardware or CPU is not available during design cycle of system. Mostly used in hard 


real time system for which hardware are specially designed as per requirement of the system. 


The third method is the directly execute the programs on target hardware and observe the execution time under 
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different scenarios and calculate average execution time. This process requires hardware to be known while designing the 


system. This method is most effective and efficient in firm and soft real time system design. 


The third method explained is used in proposed work to calculate the execution time of each process. The time 
required in case of single process execution and time required when process is invoked as function in another process both 


are calculated on Raspberry Pi 3b+ operating at 7OOMHz standard frequency. 


The flow diagram of implementing dynamic frequency adaptation in multitasking environment is shown below, 
the scheduling policies are indicated with common block diagram as they only modify the process execution queue. 
Whereas the entire system operation and decision-making process remains same for any scheduling policy. Hence the 
technique proposed can be used with implementation of any scheduling policies and not limited to the EDF and RM 


policies. 
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